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industrial applicability 
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derives from the fact that they are composed of a list of vague desiderata, rather 
than a clear combination of technical features having a well defined inter-working 
relationship. Indeed the entire specification, notwithstanding the applicants' 
arguments to the contrary, also consists of a vague conflation of ideas loosely 
related to desirable features of a video conferencing system, many of which are 
already well known in the art and the remainder thereof relating to routine 
measures normally to be expected of a skilled person without the exercise of any 
inventive step, whether considered alone or in any particular combination. 

2 The two figures accompanying the specification are also devoid of any real 
technical content, representing as they do, the vaguest of schematic 
representations of a central service provider linked to a plurality of subscribing 
stations, in a manner commonly known in prior art proposals for e.g. interactive 
TV/Internet services. 

3 Thus the entire application appears to lack any clearly patentable matter. 
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the supportive arguments presented, it now appears that the claims all lack either 
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wherein, in a multi-location television conference system that connects five 
locations A, B, C, D, and E, when speeches take place at the four locations A, B, 
C, and D at the same time, at a listening location E, images of all the speaking 
locations A, B, C, and D are displayed on one screen with four divided screen 
areas. On the other hand, at the speaking location A, images of the speaking 
locations B, C, and D and an image of the former speaking location E are 
displayed on one screen with four divided screen areas. In addition, when images 
of speaking locations are displayed, locations names thereof are also displayed. 
Thus, a television conference held at a plurality of locations at a time can be 
smoothly managed as with a real conventional conference; and 

D2: WO 98 23075 A (UNISYS CORP) 28 May 1998 - 

in which a hub is disclosed for multimedia multipoint video teleconferencing. 
The hub has a plurality of input/output ports, each of which may be coupled to a 
communication channel for interchanging teleconferencing signals with remote 
sites. The hub includes a plurality of signal processing functions that can be 
selectively applied to teleconferencing signals so that the signals distributed by 
the hub are in a form desired by the recipients or required by their 
equipment. Signal processing may include video, data, graphics, and 
communication protocol or format conversion, and language translation. 
The hub may generate data relating to the use of its processing capabilities during 
a teleconference, so that accounting or billing for a teleconference may be based 
at least in part on which hub resources were used, the extent of their use, and the 
person desiring their use. The identification of a signal processing function to 
be used during a teleconference may be automatically performed in 
response to the content of signals received at the hub during the 
teleconference. 

Beyond this state of the art, the minor linguistic or technical differences of the 
claims and/or the described embodiments, so far as these can at present be 
understood, do not appear to relate to any essential technical feature of an 
invention and therefore are not considered to imply an inventive step. 

Therefore, it does not appear that any aspect of the application could serve 
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A further disadvantage of the currently viable 
videoconferences is given by the fact that it is not 
possible to superimpose titles, subtitles, 
abbreviations, speakers' names, musical themes and 
soundtracks, and all audio and video effects that can 
make of a "flat" and static videoconference a real 
television programme. 

In this respect, it is useful to observe that said 
problems and drawbacks have not only got purely 
aesthetic consequences, but they also cause a rapid 
decrease in the level of attention of the attendants, 
which is an extremely important factor for the success 
of a conference of whatever type. 

A first aim of the present invention is that of 
allowing the course of videoconferences (congresses, 
debates, presentations, lectures, etc.) with the 
utilization of audiovisual contributions such as films, 
slides, photographs, animated computer aided design, 
graphs, music and/or soundtracks etc. 

A second aim of the present invention is that of 
guaranteeing an orderly and fluent course of a 
videoconference, thanks to the audio-visual 
commutations carried out by the operators of the 
direction room, and by the possible presence of a 
chairperson, who is meant to allow the user to 
personally take part in the debate, only at the most 
suitable moment. 

A third aim of the present invention is that of 
giving the possibility to attend a conference even to 
Internet users. Furthermore, through a series of 
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CLAIMS 

1. Process for carrying out and managing 
videoconferences among remote and/or local users, 
characterised by the fact that it comprises the 
following steps: 

- link-up to a direction room (1) with a plurality of 
both remote and neighbour locations (2), which a signal 
of the audio video type (AV) originates at; 

- if necessary, conversion of the audiovisual signal 
(AV) from each location (2), before its transfer from 
the place where it was generated to that where the 
direction room is located (1), so as to make it 
suitable to the type of connection and transmission 
which are being utilised; 

- Reconversion of the signal (AV) which has been 
received, if this is necessary, into the audio video 
format, before its arrival at the direction room (1); 

- Selection of the signal or signals to use and send 
away to the attendants and the speaker respectively, by 
an input audio video matrix (MV1); 

- Addition of the contributions and the necessary audio 
and/or video effects, as well as of titles, 
soundtracks, comments, images, graphs and so on, by a 
video mixer or a computer having similar functions; 

- Selection of the processed audio video signals (AV) 
and sending thereof to the several remote locations 
(2), according to the role that the users who are there 
located play at that moment (i.e. attendants or 
speakers). 

2. Process according to claim 1, characterised by 
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suitable audio signal (A1, A2, ..., An) has been 
associated, i.e. the one that corresponds to the 
translation required by the user. 

6. Process according to the preceding claims, 
characterised by the fact that more than one user can 
receive the same audio video signal (AV). 

7. Process according to the preceding claims, 
characterised by the fact that it provides for the 
recording of the audio video signal for the purpose of 
archive or else, so well as it is actually seen by the 
attendants, that is enriched with the audiovisual 
contributions and the television effects that have been 
added, by a suitable videotape recorder (VD2) that 
receives the output signal of a video mixer (MIX) or 
computer with similar functions. 

8. Apparatus for carrying out and managing 
videoconferences between remote and/or neighbour users, 
characterised by the fact that it comprises a plurality 
of remote and/or neighbour user-locations (2), of the 
interactive or multimedial type which are linked to a 
direction room (1) which exchanges a signal (AV) of the 
analog and/or digital audiovisual type with them. 

9. Apparatus according to claim 8, characterised 
by the fact that said signal (AV) contains a series of 
information relative to the conference and the speaker 
or the speakers that are scheduled to talk, as well as 
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A further disadvantage of the currently viable 
videoconferences is given by the fact that it is not 
possible to superimpose titles, subtitles, 
abbreviations, speakers' names, musical themes and 
soundtracks, and all audio and video effects that can 
make of a "flat" and static videoconference a real 
television programme. 

In this respect, it is useful to observe that said 
problems and drawbacks have not only got purely 
aesthetic consequences, but they also cause a rapid 
decrease in the level of attention of the attendants, 
which is an extremely important factor for the success 
of a conference of whatever type. 

Also known, from EP-A-0619679, is a multi-location 
television conference system that connects five 
locations A, B, C, D, and E, wherein, when speeches take 
place at the four locations A, B, C, and D at the same 
time, at a listening location E, images of all the 
speaking locations A, B, C, and D are displayed on one 
screen with four divided screen areas. On the other 
hand, at the speaking location A, images of the 
speaking locations B, C, and D and an image of the 
former speaking location E are displayed on one screen 
with four divided screen areas. In addition, when 
images of speaking locations are displayed, locations 
names thereof are also displayed. Thus, a television 
conference held at a plurality of locations at a time 
can be smoothly managed as with a real conventional 
conference. 

A first disadvantage of this conference system is 
that it does not allow the connection among systems 
having different transmission protocols, different types 
of signals or different technologies. 

A second disadvantage of EP-A-0619679 is that said 
limitations prevent the simultaneous connection and use 
of quite different transmission channels such as 
satellite, computer network, telephone lines, internet, 
and so on. 

A third disadvantage of EP-A-0619679 is that the 
information that can be displayed on the screen of each 
user, by superimposition with the images of the most 
recent speaker/s, is very limited and requires the use 
and the creation of identification codes. 

A fourth disadvantage of EP-A-0619679 is that it 
does not provide means for substituting the audio signal 
of one or more users with the audio signal coming from a 
simultaneous-translation room that translates, in real 
time, the discourse of the speaker into the language of the user. 

Another disadvantage of the system disclosed in 
EP-A-0619679 is that the switching of the images 
displayed is subject to the detection of an audio 
signal. 

A first aim of the present invention is that of 
allowing the course of videoconferences (congresses, 
debates, presentations, lectures, etc.) with the 
utilization of audiovisual contributions such as films, 
slides, photographs, animated computer aided design, 
graphs, music and/or soundtracks etc. 

A second aim of the present invention is that of 
guaranteeing an orderly and fluent course of a 
videoconference, thanks to the audio-visual 
commutations carried out by the operators of the 
direction room, and by the possible presence of a 
chairperson, who is meant to allow the user to 
personally take part in the debate, only at the most 
suitable moment. 

A third aim of the present invention is that of 
giving the possibility to attend a conference even to 
Internet users. Furthermore, through a series of 



CLAIMS 

1. Process for carrying out and managing 
videoconferences among a plurality of user locations 
suitable to receive and transmit audio-video signals 
and located at whatever distance, using whatever 
communication protocol, characterised by the fact that 
it comprises the following steps: 

- link-up a direction room (1) to a plurality of both 
remote and neighbour locations (2), where a signal of 
the audio video type (AV) is originated; 

- conversion of the audiovisual signal (AV) from each 
location (2), before its transfer from the place where 
it was generated to that where the direction room is 
located (1), so as to make it suitable to the type of 
connection and transmission which are being utilised; 

- Reconversion of the signal (AV) which has been 
received into the audio video format, before its 
arrival at the direction room (1); 

- Selection of the signal or signals to use and send 
away to the attendants and the speaker respectively, by 
an input audio video matrix (MV1); 

- Addition of the contributions and the necessary audio 
and/or video effects, as well as of titles, 
soundtracks, comments, images, graphs and so on, by a 
video mixer or a computer having similar functions; 

- Selection of the processed audio video signals (AV) 
and sending thereof to the several remote locations 
(2), according to the role that the users who are there 
located play at that moment (i.e. attendants or 
speakers). 

2. Process according to claim 1, characterised by 
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suitable audio signal (A1, A2, ..., An) has been 
associated, i.e. the one that corresponds to the 
translation required by the user. 


6. Process according to the preceding claims, 
characterised by the fact that more than one user can 
receive the same audio video signal (AV). 

7. Process according to the preceding claims, 
characterised by the fact that it provides for the 
recording of the audio video signal for the purpose of 
archive or else, so well as it is actually seen by the 
attendants, that is enriched with the audiovisual 
contributions and the television effects that have been 
added, by a suitable videotape recorder (VD2) that 
receives the output signal of a video mixer (MIX) or 
computer with similar functions. 

8. Apparatus for carrying out and managing 
videoconferences among a plurality of users located at 
whatever distance and using whatever communication 
protocol, characterised by the fact that it comprises a 
plurality of remote and/or neighbour user-locations 
(2), of the interactive or multimedial type which are 
linked to a direction room (1) which exchanges a signal 
(AV) of the analog and/or digital audiovisual type with 
them. 

9. Apparatus according to claim 8, characterised 
by the fact that said signal (AV) contains a series of 
information relative to the conference and the speaker 
or the speakers that are scheduled to talk, as well as 
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PROCESS FOR CARRYING OUT VIDEOCONFERENCES WITH THE SIMULTANEOUS INSERTION OF AUXILIARY 
INFORMATION AND FILMS WITH TELEVISION MODALITIES 


10 


15 


20 


25 


DESCRIPTION 

The present invention relates to the field of 
multimedia communications, and more particularly to a 
process and an apparatus for videoconferences 
that provide link-ups among several attendants, 
with extremely variable characteristics and modalities, 
adaptable to any specific need of the user. 

Currently, multiple user videoconference apparati 
and techniques are known which, despite being based on 
different execution parameters, choose the 
image to be shown to the attendants on the basis 
of the audio signal coming from the attendants 
themselves, which is technically called "audio 
presence". 

In other words, the sound received by the 
microphone located at every equipped location reaches 
the centralised videoconference management device. 
This device shows all the attendants the image of the 
user that has generated the sound impulse. In such a 
way, all the attendants receive on their screens the 
image of the person that is speaking at that precise 
moment. It is therefore clear that if two or 
more users speak at the same time, the conference 
management device carries out image commutations on a 
continuous basis, causing considerable disruptions and 
chaos all along the course of the videoconference 
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itself. 

Attention is also drawn to the fact that a user is 
allowed into a dialogue which has already started 
merely because of a background noise from his own environment, 
which could be completely independent from his will but 
is detected by the microphone located at his place. 
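
The drawback of switching on "audio presence" can be illustrated with a minimal sketch. It assumes, purely for illustration, that the management device samples one audio level per location at regular instants and shows the loudest location whose level reaches a threshold; the function name, the threshold and the sample levels are hypothetical and are not taken from any known apparatus.

```python
def audio_presence_switching(frames, threshold):
    """Return the location shown at each instant under audio-presence switching.

    frames: list of dicts mapping a location name to its audio level (0.0 to 1.0).
    threshold: minimum level that triggers a commutation (assumed rule).
    """
    shown = []
    current = None
    for levels in frames:
        loudest = max(levels, key=levels.get)
        if levels[loudest] >= threshold:
            current = loudest  # the image follows whoever is loudest
        shown.append(current)
    return shown


def count_commutations(shown):
    """Count how many times the displayed image changes."""
    return sum(1 for prev, cur in zip(shown, shown[1:]) if prev != cur)


# Two users speaking over each other (or one speaker plus background noise at B).
frames = [
    {"A": 0.9, "B": 0.1},
    {"A": 0.2, "B": 0.8},
    {"A": 0.9, "B": 0.7},
    {"A": 0.1, "B": 0.6},
]
shown = audio_presence_switching(frames, threshold=0.5)
commutations = count_commutations(shown)
```

With these sample levels the image changes at every instant, which is the continuous commutation described above.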

Currently, in order to resolve such problems, it 
is necessary to turn off one's own microphone (but this 
risks turning an interesting debate into an endless 
monologue). 

This type of automatic commutation caused by the 
audio presence necessarily requires the presence of an 
interpreter next to each single attendant, in case of 
videoconferences that involve people speaking different 
languages. 

Besides this, current technology does not always 
allow a link-up between different 
videocommunication systems. The apparati which are 
currently being used in fact only allow file 
transmission and/or sharing if the link-up 
devices of the several attendants are made by the same 
manufacturer, thereby drastically limiting the 
possibilities of employment of the system itself (file 
sharing, transmission and transfer, etc.). 

A further problem of the prior art is given by the 
fact that the possibility of executing fadings among 
the images of the speakers that make their 
contributions along the way, and possible audio-video 
contributions, whether they be films, photographs, 
static images, graphs and so on, is ruled out. 
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A further disadvantage of the currently viable 
videoconferences is given by the fact that it is not 
possible to superimpose titles, subtitles, 
abbreviations, speakers' names, musical themes and 
soundtracks, and all audio and video effects that can 
make of a "flat" and static videoconference a real 
television programme. 

In this respect, it is useful to observe that said 
problems and drawbacks have not only got purely 
aesthetic consequences, but they also cause a rapid 
decrease in the level of attention of the attendants, 
which is an extremely important factor for the success 
of a conference of whatever type. 

A first aim of the present invention is that of 
allowing the course of videoconferences (congresses, 
debates, presentations, lectures, etc.) with the 
utilization of audiovisual contributions such as films, 
slides, photographs, animated computer aided design, 
graphs, music and/or soundtracks etc. 

A second aim of the present invention is that of 
guaranteeing an orderly and fluent course of a 
videoconference, thanks to the audio-visual 
commutations carried out by the operators of the 
direction room, and by the possible presence of a 
chairperson, who is meant to allow the user to 
personally take part in the debate, only at the most 
suitable moment. 

A third aim of the present invention is that of 
giving the possibility to attend a conference even to 
Internet users. Furthermore, through a series of 
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procedures and suitable links, which will be analysed 
in detail in the following, the possibility is given to 
any single spectator who is suitably equipped to 
directly enter and take part in the conference, 
contributing to it with his own image and his own audio 
(even if not originally scheduled). 

A fourth aim of the present invention is that of 
guaranteeing compatibility between different 
videocommunication systems, utilising the most suitable 
interfaces and transforming the ensemble of the 
videoconference into many point-point links 
(user-direction) with personalised characteristics and 
communication protocols. 

To this purpose attention is drawn to the fact 
that attendants, whether they be interactive or not, 
can be both remote and local and unlimited in number. 

These and other aims have been accomplished 
according to the invention, by proposing a process and 
an apparatus for the production and management of 
videoconferences, wherein audiovisual signals coming 
from a plurality of remote and/or neighbour locations, 
are acquired and processed by a direction room capable 
of dealing with and selecting both the audio and the 
video signal, adding audiovisual contributions like 
television effects, partial or total image 
superimposition, insertion of graphs, tables, films or 
soundtracks, audio commentaries, and so on. 

According to the process and the apparatus which 
are herein described, it is also possible to provide a 
centralised interpretation service, discriminating on 
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the audio supplied by the users as a function of their 
language. 

A better understanding of the present invention 
will be gained thanks to the following detailed 
description with reference to the appended drawings, 
which schematically illustrate a preferred embodiment 
of the invention. 

In the drawings: 

Fig. 1 schematically illustrates the parts making 
up the direction room according to the present 
invention; 

Fig. 2 is a scheme illustrating the modalities and 
possibilities of link-up between the direction room and 
remote and/or neighbour users, by use of telephone 
lines, via satellite, via Internet, and so on. 

With reference to the abovementioned figures, the 
process object of the present invention comprises the 
following stages: 

- link-up of a direction room 1 with a plurality of 
remote and/or neighbour locations 2, which generate an 
audio video signal AV; 

- conversion, if necessary, of the audiovisual 
signal AV from every location, before its transfer from 
the place where it is generated to that where the 
direction room 1 is located, to adapt it to the type of 
connection and transmission which is employed; 

- reconversion of the received signal, if 
necessary, to an audio-video format, before its 
entrance to direction room 1; 

- selection of the signal/s to be used and sent, 
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respectively, to the attendants and the speakers by an 
input audio-video matrix MV1; 

- addition of the necessary audiovideo 
contributions and effects, as well as of titles, 
soundtracks, commentaries, graphs, and so on, by video 
mixer MIX or computer with analogous functions; 

- selection of the processed audiovideo signals 
and their forwarding to the several remote locations 2, 
according to whether at that moment they are 
attendants or speakers. 
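
The last three stages listed above (selection, addition of contributions, and forwarding according to role) can be sketched as follows. This is only an illustrative model under stated assumptions: signals are represented by text labels, and the function names `select_inputs`, `mix` and `route` are hypothetical stand-ins for the input matrix MV1, the video mixer MIX and the output selection respectively; none of them is defined in the present description.

```python
def select_inputs(signals, chosen):
    """Stand-in for input matrix MV1: pick the signals to be used."""
    return [signals[name] for name in chosen]


def mix(selected, contributions):
    """Stand-in for mixer MIX: add titles, soundtracks and similar contributions."""
    return " + ".join(selected + contributions)


def route(roles, programme, speaker_return):
    """Stand-in for the output selection: forward a signal according to role."""
    return {
        name: speaker_return if role == "speaker" else programme
        for name, role in roles.items()
    }


# Signals AV arriving from three locations 2 (labels are placeholders).
signals = {"A": "AV_A", "B": "AV_B", "C": "AV_C"}
roles = {"A": "speaker", "B": "attendant", "C": "attendant"}

selected = select_inputs(signals, ["A"])      # the speaker's signal is chosen
programme = mix(selected, ["title"])          # contributions are added
feeds = route(roles, programme, speaker_return=signals["B"])
```

In this sketch the attendants at B and C receive the processed programme, while the speaker at A receives a different signal chosen by the direction room (here, arbitrarily, the signal from B).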

According to a particular aspect of the process 
described above, while the attendants receive the audio 
video signal from the speaker, the latter will be 
capable of receiving a different audio signal which has 
been selected by the direction room. 

For example, the speaker will be capable of 
receiving an overview of all the attendants or of some 
of them, just by using a device that selects the 
desired signals from the signals AV of the several 
locations and forwards them to the output audio video 
matrix for the subsequent forwarding to the speaker. 

Moreover, the speaker might have on his own screen a 
graph that he is commenting on to the attendants, and 
the latter receive it full screen whilst seeing 
the image of the speaker himself superimposed or 
occupying a portion of the screen itself. 

A second advantageous aspect of the present 
invention is that it is possible to send audio signal 
A coming from the speaker to an interpretation room I, 
wherein a simultaneous translation is carried out into 
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the languages required by the attendants. 

The signal that is sent to each attendant 
therefore consists of the video signal (V1, V2, ..., Vn) 
ad hoc selected for him, to which a suitable audio 
signal has been associated (A1, A2, ..., An), therefore 
corresponding to the translation required by the user. 
It is obvious that more than one user can receive the 
same audio video signal AV. 
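
The association of a video signal with the audio signal of the required translation can be sketched as below. The sketch assumes, for illustration only, that each attendant declares a language, that the interpretation room supplies one audio signal per translated language, and that attendants with no matching translation receive the original audio; the function name, the language codes and the fallback rule are hypothetical.

```python
def assemble_feeds(attendants, video_for, audio_by_language, original_audio):
    """Pair each attendant's selected video with the audio of his language.

    attendants: dict mapping attendant name to requested language.
    video_for: dict mapping attendant name to the video signal selected for him.
    audio_by_language: dict mapping language to a translated audio signal.
    original_audio: audio used when no translation is requested (assumed rule).
    """
    feeds = {}
    for name, language in attendants.items():
        audio = audio_by_language.get(language, original_audio)
        feeds[name] = (video_for[name], audio)
    return feeds


attendants = {"u1": "it", "u2": "en", "u3": "en"}
video_for = {"u1": "V1", "u2": "V2", "u3": "V2"}
audio_by_language = {"en": "A2"}

feeds = assemble_feeds(attendants, video_for, audio_by_language, original_audio="A1")
```

Here attendants u2 and u3 end up with the same pair (V2, A2), which corresponds to the case in which more than one user receives the same audio video signal.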

Advantageously, according to the process that is 
herein described, it is also possible to record the 
audio video signal for an archive, just as it is 
watched by the attendants, that is with the audiovisual 
contributions and the television effects that have been 
added. 

As far as the apparatus apt to carry out the 
process described so far is concerned, it 
may substantially comprise a plurality of 
user-locations 2 (fig. 1), which are remote and/or local, and 
of the multimedial or interactive type, possibly 
equipped with a coder/decoder, otherwise called 
CODEC, with an aggregator that transforms the analog 
audiovideo signal AV into a digital signal, and linked 
to a direction room 1 that exchanges a signal AV of the 
analog or digital audio visual type. 

Said signal AV contains a set of information 
relative to the conference and the speaker or the 
speakers that are given the right to speak from time to 
time, as well as other auxiliary audiovisual 
information. 

Said user-locations 2 comprise audio visual 
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input/output means, such as for example computer or 
multimedia stations, tie-line linked-up locations, 
while the signal transmission between said locations 
and the direction room, and vice-versa, can take place 
indifferently through (analog and/or ISDN) telephone 
lines, which can themselves be aggregate or not, 
satellite transmission appliances, data transmission 
networks (including Internet), and so on. 

The signal from each remote location 2, whether it 
be digital or analog, is converted into an audiovideo 
signal, while afterwards it is sent to an audio video 
matrix MV1 which deals with all the signals and gives 
one or more output signals. 

From a strictly practical point of view, direction 
room 1 simultaneously receives signals AV from all 
users 2 connected to the video conference, and it 
further controls the audiovideo synchronism in each 
single channel and, if necessary, it suitably modifies 
it (possible misalignments can be generated by 
several components: transmission, channel aggregation, 
reconversion). 

Signals AV coming from locations 2 are each 
visualised by a number of monitors and they are 
forwarded to audiovideo matrix MV1. 

The signals which have been selected are sent to a 
video mixer MIX, or computer with analogous functions, 
which is apt to act as an interface with a series of 
appliances like Personal Computers PC, Videotape 
recorders VD1, cameras, titlers T, audio equipment, and 
so on. 
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According to a peculiar aspect of the present 
invention, the employment of such video mixer MIX 
advantageously provides the addition to or the 
superimposition onto the videoconference signal, that 
is the signal coming from the speaker, of a series of 
audiovisual contributions such as titles, subtitles, 
musical themes, soundtracks, audio and video fadings, 
slides and/or graphs. 

Furthermore it is possible to visualise the name 
of the speaker that is talking in a certain definite 
moment, to carry out image superimpositions, to utilise 
and apply special effects and/or whatever other 
audiovisual contribution that makes the videoconference 
more versatile and adaptable to the needs of a specific 
moment. 

This means that it is also possible to 
superimpose, back up with or create effects between the 
image of the speaker and films that support his talk, 
or graphs that he is creating himself and/or changing 
in that moment, and so on. 

Advantageously, during a certain videoconference 
this makes it possible to emphasise moments of 
particular interest, and furthermore to underline 
relevant data during the talk, to highlight the aims to 
accomplish and/or particularly relevant news for the 
topic which is being dealt with. 

Thus, the audio video signal which has been 
processed by the video mixer MIX or by a computer with 
analogous functions, is forwarded to a second 
audiovideo matrix MV2 and finally to a videotape 
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recorder VD2 which records the videoconference. 

This second audiovideo matrix MV2, or visual 
signal sorting-out device, supplies the audio-video 
signals to be sent to each single user 2, whether they 
be remote or local. 

The two input and output commutation devices of 
the direction room (audio video matrices MV1, MV2 or 
analogous devices) ensure a total compatibility between 
different videocommunication systems, through said 
plurality of CODEC or specific interfaces, so as to 
make it possible to carry out transmissions involving 
apparati with technological features that made them 
incompatible so far. Moreover it is possible to use 
just one video matrix, if this is believed necessary by 
the direction room, in lieu of the two abovementioned 
ones. 

As previously highlighted, another 
peculiar feature of the present invention is given by 
the fact that it is possible to capture audio signal A 
before it reaches output audio video matrix MV2, so as 
to make it possible to have a simultaneous translation 
by one or more interpreters into the language or 
languages used by the attendants if these explicitly 
showed a need for it or if they made a clear request to 
the organisation. 

In other words, audio signal A that is sent into 
interpretation room I for translation is then 
associated with video signal V at the output of the 
second audiovideo matrix MV2 in real time, in such a 
way that the translation or the translations are 
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listened to by all the attendants that requested to be 
supplied with such a service. 

Advantageously, according to the invention, 
direction room 1 can intervene at any moment by using 
audiovideo matrices, substituting audiovideo signal AV 
which is forwarded to one or more remote or local 
attendants 2 with audio video signal AVR, accomplishing 
an "intercom" type communication while the users who 
are not involved keep attending the videoconference 
without any disruptions or interferences. 

From what has been explained so far it follows that signal AV 
which is processed by direction room 1 must be of the 
analog or digital audio video type: therefore the input 
and output signals, i.e. directed to and coming from it, 
which are not audio video, must be transformed before 
their employment and finally retransformed at the very 
moment when they are to be sent to remote attendants in 
the analog or digital form. 

These two input and output conversions at the 
direction room depend on the features of the link-up 
with the remote users, once again categorisable as 
digital or analogue, which can be carried out by the means 
that the user believes most suitable: analogue, ISDN or 
aggregate ISDN telephone lines, satellite transmission, 
computer networks (such as Internet for example), and 
so on. 

From what has been described so far, it appears to be 
rather clear that all the attendants to the 
videoconference receive the audio video signal from the 
person that is speaking. Advantageously though, by 
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doubling all the incoming signals, on the speaker's 
screen there can be shown the attendant to 
whom he is answering directly or with whom he intends 
to engage in a discussion, or, in a cyclical fashion, 
all the participants to the conference (one by 
one or by groups, resorting to audio video multi-signal 
simultaneous combination devices). 

To said signal which is forwarded to the speaker 
another signal can be added or substituted, this latter 
10 having been selected by the direction. 

This is accomplished by a targeted or cyclical
selection device SR, whose output signal is sent
exclusively to the user that is at that moment playing
the role of speaker, or otherwise to a group of users;
this is done by resorting to the second audio video
matrix MV2 and whatever else is believed to be most
suitable for that purpose by the direction.
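
As a non-limiting illustration of the role-dependent
delivery just described, the following Python sketch
builds the table that an output matrix such as MV2 might
apply. It assumes that each location is labelled either
"speaker" or "attendant"; the function name route_by_role
and the string labels "AV" and "SR" are hypothetical and
chosen only for the example.

```python
def route_by_role(roles, conference_signal="AV", speaker_signal="SR"):
    """Map each location to the signal it should receive.

    Attendants receive the processed conference signal, whereas whoever
    currently acts as speaker receives the output of the selection device.
    """
    routes = {}
    for location, role in roles.items():
        if role == "speaker":
            routes[location] = speaker_signal
        elif role == "attendant":
            routes[location] = conference_signal
        else:
            raise ValueError(f"unknown role {role!r} for {location!r}")
    return routes


# Hypothetical roles at a given moment of the conference.
roles = {"rome": "speaker", "milan": "attendant", "turin": "attendant"}
routes = route_by_role(roles)
```

When the role of speaker passes to another location, the
table is simply rebuilt from the updated roles.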

It is useful to observe that a cyclical selection
can take place at controllable time intervals, by means
of a timer-programmer or a computer for example.
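
The timed cyclical selection can be pictured, purely as
an example, with the following Python sketch. It does not
reproduce the device SR itself; the function name
cyclical_schedule and its parameters (interval in
seconds, number of steps, group size) are assumptions
introduced for the illustration.

```python
from itertools import cycle, islice


def cyclical_schedule(attendants, interval_s, steps, group_size=1):
    """Return (start_time, group) pairs showing the attendants one by one,
    or by groups, each for `interval_s` seconds, wrapping around."""
    if group_size < 1 or interval_s <= 0:
        raise ValueError("group_size must be >= 1 and interval_s > 0")
    groups = [
        tuple(attendants[i:i + group_size])
        for i in range(0, len(attendants), group_size)
    ]
    if not groups:
        return []
    return [
        (step * interval_s, group)
        for step, group in enumerate(islice(cycle(groups), steps))
    ]


# Hypothetical example: three attendants shown for 10 seconds each.
schedule = cyclical_schedule(["A", "B", "C"], interval_s=10, steps=4)
# -> [(0, ('A',)), (10, ('B',)), (20, ('C',)), (30, ('A',))]
```

A group size greater than one corresponds to the case in
which several attendants are combined on the speaker's
screen at the same time.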

According to another particular feature of the
present invention, the director has the possibility of
selecting the speaker who is scheduled to talk at that
moment and who will be shown full screen to all the
other speakers and/or attendants 2. In addition, it is
possible to keep the audio channel of all or some of
the attendants 2 active, enabling the apparatus to
automatically display the participants that intervene
briefly and temporarily, in the form of windows or
pointers (spots) suitably placed on the screen.

Another extremely advantageous aspect of the
present invention is the possibility of transmitting
the videoconference via the Internet. By suitable
(aggregate or tie-line) connections between the
direction room and the Internet provider, it is
possible to broadcast the audio video signal AV of the
videoconference, which comes from the output audio
video matrix MV2, to any Internet user.

Furthermore, by means of a suitable discussion group,
each single user can ask questions, show examples and
actively take part in the debate.

The chairperson or the person in charge of the
videoconference will be able to view all the
communications of the final users or attendants, by
means of a computer PCM connected to the same
discussion group.

He will be able to ascertain whether they are worth
addressing to one of the speakers, who will then be
able to answer through the channels and modalities of
the videoconference already described.

If, on the other hand, the chairman believes it
suitable to let the Internet user UI contribute
personally to the videoconference, direction room 1 is
capable of carrying out an unplanned but nonetheless
possible telephone link-up AV-UI, turning the Internet
user UI from a spectator into an actor and offering him
a chance to take part in the conference in just the
same manner as the other participants that are
connected (provided that said latecomer has the minimum
equipment necessary for taking part in a
videoconference having the previously described
modalities and features).

Advantageously, in the case of an Internet link-up,
besides normal switched or ISDN telephone lines, the
connection between the remote user and the provider can
be carried out by means of a mixed signal management
system where the requests of the user are transmitted
to the provider over the telephone lines, whereas the
audio video signal of the videoconference or the data
which have been requested can be received via
satellite, leading to a drastic improvement in quality
and increasing the speed of reception regardless of the
traffic on the network and of the number of users
connected to it at that very moment.

Furthermore, using the Internet, it is possible to
carry out the transmission and exchange of data files,
regardless of the type of data contained therein, in a
manner which is fully compatible with any type of
computer or computer system.

Said remote or neighbour locations 2 may also
comprise a camera and a microphone which are able to
send the audiovisual signal of a certain event, like a
parade or a sports match, to direction room 1, which
manages it in the most suitable manner.

According to the present invention, it is possible
to conduct even very "intense" debates between a
limited number of participants, avoiding frequent image
changes; this is accomplished by subdividing the screen
into adjacent windows and enabling the audio of the
entire discussion group. In this case only those who
are part of said restricted group of people are shown
on the screen, all at the same time.
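
The subdivision of the screen into adjacent windows can
be illustrated by the following Python sketch, which is
only one possible arrangement. The near-square grid, the
function name window_layout and the 720 x 576 pixel
screen used in the example are assumptions and are not
taken from the present description.

```python
import math


def window_layout(participants, width, height):
    """Assign each participant an (x, y, w, h) window on a grid that is as
    close to square as possible, filling rows from left to right."""
    count = len(participants)
    if count == 0:
        return {}
    columns = math.ceil(math.sqrt(count))
    rows = math.ceil(count / columns)
    win_w, win_h = width // columns, height // rows
    return {
        name: ((i % columns) * win_w, (i // columns) * win_h, win_w, win_h)
        for i, name in enumerate(participants)
    }


# Hypothetical restricted group of four people on a 720 x 576 screen.
layout = window_layout(["A", "B", "C", "D"], 720, 576)
```

With four participants the sketch yields a two by two
grid of windows of 360 x 288 pixels each; with three
participants the same grid is used and one window is
left empty.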

It is useful to note that, using CODECs, it is
possible to control remote cameras based at locations
2. This means that the staff in the direction room is
capable of showing or zooming in on details at their
own discretion, by sending suitable commands that are
executed by the camera located at the user's location.
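
The present description does not specify the format of
the commands sent to the remote camera. The following
Python sketch therefore uses a purely hypothetical
message (pan, tilt and zoom values encoded as JSON) and
arbitrary mechanical limits, simply to show how a command
could be prepared in the direction room and decoded at
the user's location.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CameraCommand:
    location: str    # label of the user location (hypothetical)
    pan_deg: float   # horizontal angle in degrees
    tilt_deg: float  # vertical angle in degrees
    zoom: float      # magnification factor


def clamp(value, low, high):
    """Keep `value` within the closed interval [low, high]."""
    return max(low, min(high, value))


def make_command(location, pan_deg=0.0, tilt_deg=0.0, zoom=1.0):
    """Build a command, clamping each value to assumed camera limits."""
    return CameraCommand(
        location,
        clamp(pan_deg, -90, 90),
        clamp(tilt_deg, -30, 30),
        clamp(zoom, 1.0, 10.0),
    )


def encode(command):
    """Serialise the command for transmission (direction room side)."""
    return json.dumps(asdict(command), sort_keys=True).encode("utf-8")


def decode(payload):
    """Rebuild the command from the received bytes (camera side)."""
    return CameraCommand(**json.loads(payload.decode("utf-8")))


# A pan request beyond the assumed limit is reduced to 90 degrees.
command = make_command("location_2", pan_deg=120, zoom=4)
payload = encode(command)
```

In an actual installation the message format would be
dictated by the CODECs and cameras in use.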

In particular cases, it is finally possible to 
envisage link-ups between direction room 1 and the 
users exclusively via satellite. 

The present invention can also be applied to other
fields such as: conferences, training and refresher
courses, sales, advertising, consultancy services,
tourism and others.

The present invention has been described and
illustrated according to one preferred embodiment, but
it is understood that anyone skilled in the art may
amend or change it without departing from the scope of
the present patent.

CLAIMS

1. Process for carrying out and managing
videoconferences among remote and/or local users,
characterised by the fact that it comprises the
following steps:
- Link-up to a direction room (1) of a plurality of
both remote and neighbour locations (2), at which a
signal of the audio video type (AV) originates;

- If necessary, conversion of the audiovisual signal
(AV) from each location (2), before its transfer from
the place where it was generated to that where the
direction room (1) is located, so as to make it
suitable for the type of connection and transmission
which are being utilised;

- Reconversion of the signal (AV) which has been
received, if this is necessary, into the audio video
format, before its arrival at the direction room (1);

- Selection of the signal or signals to be used and
sent to the attendants and the speaker respectively, by
an input audio video matrix (MV1);

- Addition of the contributions and the necessary audio
and/or video effects, as well as of titles,
soundtracks, comments, images, graphs and so on, by a
video mixer or a computer having similar functions;

- Selection of the processed audio video signals (AV)
and sending thereof to the several remote locations
(2), according to the role that the users who are
located there play at that moment (i.e. attendants or
speakers).

2. Process according to claim 1, characterised by
the fact that, while the attendants of the conference
receive the audio video signal from the speaker, the
speaker receives a different audio video signal which
has been selected at the direction room (1).

3. Process according to claim 2, characterised by
the fact that the speaker receives an overview of the
attendants (2), or of some of them, by the use of a
targeted or cyclical selection device (SR) that
selects the desired signals from the signals that
arrive from the several locations and forwards them to
the output audio video matrix (MV2) for their
subsequent delivery to the speaker; said signals (AV)
being capable of being simultaneously combined.

4. Process according to the preceding claims,
characterised by the fact that the speaker is shown, on
his own screen, the graph that he is talking about to
the attendants, the attendants receiving said graph as
a superimposition on, or within a section of, the image
of the speaker himself, or vice versa.

5. Process according to the preceding claims,
characterised by the fact that it provides for the
audio signal (A) from the speaker to be sent to the
interpretation room (I), wherein a simultaneous
translation into the languages required by the
attendants is carried out; the signal which is sent to
each attendant being therefore composed of the video
signal (V1, V2, ..., Vn) selected for him, with which
the suitable audio signal (A1, A2, ..., An) has been
associated, i.e. the one that corresponds to the
translation required by the user.

6. Process according to the preceding claims,
characterised by the fact that more than one user can
receive the same audio video signal (AV).

7. Process according to the preceding claims,
characterised by the fact that it provides for the
recording of the audio video signal, for archiving or
other purposes, exactly as it is seen by the
attendants, that is, enriched with the audiovisual
contributions and the television effects that have been
added, by a suitable videotape recorder (VD2) that
receives the output signal of a video mixer (MIX) or
computer with similar functions.

8. Apparatus for carrying out and managing
videoconferences between remote and/or neighbour users,
characterised by the fact that it comprises a plurality
of remote and/or neighbour user-locations (2), of the
interactive or multimedia type, which are linked to a
direction room (1) which exchanges a signal (AV) of the
analog and/or digital audiovisual type with them.

9. Apparatus according to claim 8, characterised
by the fact that said signal (AV) contains a series of
items of information relating to the conference and the
speaker or speakers that are scheduled to talk, as well
as other auxiliary audiovisual information.

10. Apparatus according to claims 8 and 9,
characterised by the fact that said user-locations (2)
comprise audiovisual input/output means; signal
transmission between said locations and the direction
room, and vice versa, taking place indifferently via
(aggregate or not, analog and/or ISDN) telephone lines,
tie lines, satellite transmission devices, data
transmission networks (including the Internet), and so
on.


11. Apparatus according to claims 8, 9 and 10,
characterised by the fact that said remote locations
(2) are equipped with analog/digital audiovisual signal
conversion devices, said signal being then sent to the
direction room (1) using suitable communication
protocols according to the type of link which has been
established.


12. Apparatus according to claims 8, 9, 10 and 11,
characterised by the fact that the direction room (1)
simultaneously receives the respective signals (AV)
coming from all the users (2) linked up to the
videoconference, transforms them into audiovisual
signals by means of said conversion devices and
displays them individually on a series of monitors;
said signals (AV) are then channelled into an audio
video matrix (MV1) that makes it possible to send just
the signals coming from the speaker or speakers to the
video mixer (MIX), in such a way that they are seen by
all the other attendants, with possible image fades or
other effects.

13. Apparatus according to claims 8, 9, 10, 11 and
12, characterised by the fact that the signals (AV)
selected by means of the audio video matrix (MV1) are
forwarded to a video mixer (MIX), or a computer with
similar functions, which is capable of interfacing with
a number of appliances such as computers (PC), video
tape recorders (VD1), cameras, titlers (T), audio
equipment, and so on; said video mixer (MIX) making it
possible to add to or superimpose onto the
videoconference signal, that is, the signal from the
speaker, a series of audiovisual contributions such as
titles, subtitles, musical themes or soundtracks, audio
video fades, slides and/or graphs, displaying them
full screen or on a portion thereof.

14. Apparatus according to claims 8 to 13,
characterised by the fact that it provides for the
display of the name of the speaker that is talking at a
certain moment, for the carrying out of image
superimpositions, for the use of special effects and/or
whatever other type of audiovisual contribution that
makes the conference more versatile and adaptable to
the specific need of a certain moment; said apparatus
further providing for the superimposition, the placing
side by side or the creation of effects between the
image of the speaker and films backing up his talk, or
graphs that he himself is making or changing at that
very moment, and so on.

15. Apparatus according to claims 8 to 14,
characterised by the fact that the audio video signal
(AV), as processed by the video mixer (MIX) or by a
computer with similar functions, is sent to a second
audio video matrix (MV2), or an analogous audiovisual
signal sorting device, that provides for the signal to
be forwarded to each single user (2), regardless of
whether they be remote or local.

16. Apparatus according to claims 8 to 15,
characterised by the fact that the two input and output
switching devices of the direction room (MV1, MV2)
ensure total compatibility between different
videocommunication systems, by means of said plurality
of conversion devices, so as to provide for
transmission between pieces of equipment that belong to
technologies that have so far been incompatible.

17. Apparatus according to claims 8 to 16,
characterised by the fact that the audio signal (A) is
captured before it reaches the output audiovisual
matrix (MV2), so as to make it possible to carry out a
simultaneous translation by one or more interpreters
into the language or languages of one or more users (2)
that may require it.

18. Apparatus according to claims 8 to 17,
characterised by the fact that the audio signal (A)

WO 99/63756 PCT/IT98/00149


that is sent to an interpretation room (I) for
translation is subsequently associated, in real time,
with the video signal (V) exiting the second audio
video matrix (MV2), in such a way that each translation
is listened to only by the users that make an explicit
request for it.

19. Apparatus according to claims 8 to 18,
characterised by the fact that the audio video signal
(AV), as processed by the video mixer (MIX) or by a
computer with similar functions, is forwarded to a
videotape recorder (VD2) that records the
videoconference.

20. Apparatus according to claims 8 to 19,
characterised by the fact that the direction (1) can
intervene at any moment, by replacing the audio video
signal (AV) which is sent to one or more attendants
(2), regardless of whether they be remote or local,
with an audio video signal of its own (AVR),
accomplishing an "intercom" type communication while
the users who are not involved keep following the
videoconference without any disruption or interference.

21. Apparatus according to claims 8 to 20,
characterised by the fact that the signal (AV) which is
processed by the direction room (1) is of the audio
video type: therefore the signals coming into it that
are not in the audio video format must be transformed
before their utilisation and possibly retransformed
into an analog or digital form at the moment of their
forwarding to remote attendants; said input and output
conversions at the direction room depend on the systems
used and on the analog or digital features of the
link-up with each single remote user, which can be
accomplished by the means that the user believes to be
most suitable: analog, ISDN or aggregate ISDN telephone
lines, tie-lines, satellite transmissions, computer
networks (e.g. the Internet), and so on.

22. Apparatus according to claims 8 to 21,
characterised by the fact that all the attendants of
the videoconference receive the audiovisual signal,
selected by the direction, of the person that is
talking, while the speaker's screen displays the
attendant to whom he is answering directly, or with
whom he intends to discuss, or, in a so-called cyclical
fashion, all the attendants of the conference (one by
one or in groups); for this purpose, the doubling of
all the incoming signals (AV) being provided.

23. Apparatus according to claim 22, characterised
by the fact that said selection of the signal sent to
the speaker is obtained by means of a video matrix and
a cyclical display device, with the possibility of
simultaneously combining more than one audiovisual
source, controlled by a timer-programmer or by a
computer; the resulting signal being sent only to the
speaker and/or some particular users, by the output
video matrix (MV2), if the direction believes it
necessary.

24. Apparatus according to claims 8 to 23,
characterised by the fact that, according to schedule
or otherwise, the director can select the speaker who
is scheduled to talk, who is then displayed to all the
other attendants of the conference and/or spectators.

25. Apparatus according to claims 8 to 24,
characterised by the fact that keeping the audio
channel of all or some of the attendants of the
conference (2) active makes it possible to
automatically display the participants that intervene
temporarily and briefly, by the use of windows or
spots.

26. Apparatus according to claims 8 to 25,
characterised by the fact that, thanks to suitable
(aggregate or tie-line) link-ups between the direction
room and an Internet provider, it is possible to
transmit the audiovisual signal (AV) of the
videoconference, which comes from the output audio
video matrix (MV2), to any Internet user.

27. Apparatus according to claim 26, characterised
by the fact that, by means of a suitable discussion
group, any single user can ask questions, show examples
and actively take part in the debate; a chairperson
being capable of viewing on his own monitor all the
communications between the final users or spectators by
means of a computer, and of ascertaining whether to
refer them to one of the speakers, who can answer using
the channels and modalities of the videoconference
which have already been described.

28. Apparatus according to claim 21, characterised
by the fact that if the chairperson, on the other hand,
believes it suitable to let an Internet user (UI) take
part in the debate, the direction room (1) is capable
of carrying out an unplanned but viable telephone
link-up (AV-UI), turning the Internet user from a
"spectator" into an "actor", and offering him the
possibility of taking part in the videoconference in
just the same fashion as the other attendants who are
already connected (with the proviso that the latecomer
is sufficiently equipped for taking part in the
videoconference with the modalities and features which
were previously described).


29. Apparatus according to claims 8 to 28,
characterised by the fact that in case of an Internet
connection, besides normal switched or ISDN telephone
lines, the link-up between remote user and provider can
take place thanks to a mixed signal management system
where the requests made by the user are transmitted to
the provider by telephone, while the audio video signal
relating to the videoconference or the data which have
been requested can be received via satellite,
drastically improving the quality and the reception
speed, regardless of the traffic on the network and of
the number of users who are connected at that moment;
it being further possible to carry out the transmission
and exchange of data files of whatever type, in a
manner which is fully compatible with whatever type of
computer or computer system.

30. Apparatus according to the preceding claims,
characterised by the fact that said remote or neighbour
locations (2) can also comprise a camera and a
microphone which are able to forward the audiovisual
signal that comes from an event, such as a parade, a
sports event or the like, to the direction room (1),
which uses it in the most suitable way.

31. Apparatus according to the preceding claims,
characterised by the fact that the connections between
the several locations, whether they be remote or local,
and the direction room are managed by means of the
normal known link-up procedures, which can be by means
of a telephone line carrier, by direct phone calls, by
the Internet, via satellite, tie-lines, and so on.